Zilong Zheng

The Flip Side of RLHF: On-Policy Feedback for Reward Model Self-Supervised Improvement

May 29, 2026

Xetrieval: Mechanistically Explaining Dense Retrieval

May 28, 2026

ESPIRE: A Diagnostic Benchmark for Embodied Spatial Reasoning of Vision-Language Models

Mar 13, 2026

$OneMillion-Bench: How Far are Language Agents from Human Experts?

Mar 09, 2026

Credibility Governance: A Social Mechanism for Collective Self-Correction under Weak Truth Signals

Mar 03, 2026

The AI Hippocampus: How Far are We From Human Memory?

Jan 14, 2026

BEDA: Belief Estimation as Probabilistic Constraints for Performing Strategic Dialogue Acts

Dec 31, 2025

Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning

Dec 19, 2025

Prismatic World Model: Learning Compositional Dynamics for Planning in Hybrid Systems

Dec 09, 2025

UltraVoice: Scaling Fine-Grained Style-Controlled Speech Conversations for Spoken Dialogue Models

Oct 26, 2025